ABSTRACT


Coronaviruses (CoVs) demonstrate great potential for interspecies transmission, including 
zoonotic outbreaks. Although bovine coronavirus (BCoV) strains are frequently circulating in 
cattle farms worldwide, causing both enteric and respiratory disease, little is known about their 
genomic evolution. We sequenced and analyzed the full-length spike (S) protein gene of
thirty-three BCoV strains collected from dairy and feedlot farms in Sweden and Denmark from 2002
to 2010. Amino acid (aa) identities were >97% among the BCoV strains analyzed in this work. These
strains formed a clade together with Italian BCoV strains and were highly similar to human
enteric coronavirus HECV-4408/US/94. A high similarity was observed between BCoV, canine
respiratory coronavirus (CRCoV) and human coronavirus OC43 (HCoV-OC43). Molecular clock analysis
of the S gene sequences dated a common ancestor of BCoV and CRCoV to 1951, while a common
ancestor of BCoV and HCoV-OC43 was dated to 1899. BCoV strains showed the lowest similarity to
equine coronavirus (ECoV), with the date of divergence placed at the end of the 18th century.
Two regions under strong positive selection were detected along the receptor-binding subunit of
the S protein, spanning aa residues 109-131 and 495-527. In contrast, the fusion subunit was
under negative selection. The selection pattern along the S glycoprotein implies adaptive
evolution of BCoVs, suggesting a successful mechanism for BCoV to circulate continuously
among cattle and other ruminants without disappearing.


INTRODUCTION


Bovine coronavirus (BCoV) is a member of the Coronaviridae family, order Nidovirales 
(Cavanagh, 1997). Coronaviruses (CoVs) possess the largest viral RNA genome in nature. 
Recently, the International Committee on Taxonomy of Viruses (ICTV) proposed two subfamilies
for Coronaviridae, Coronavirinae and Torovirinae, the former comprising the three established
groups, now renamed as the genera Alphacoronavirus, Betacoronavirus and Gammacoronavirus,
respectively (de Groot et al., 2012), together with a novel (but yet to be approved) genus,
provisionally named Deltacoronavirus (Woo et al., 2012). Four separate lineages (A through D),
some of them encompassing multiple virus species, are commonly recognized within the genus
Betacoronavirus. BCoV, together with human coronavirus OC43 (HCoV-OC43), equine coronavirus
(ECoV) and porcine hemagglutinating encephalomyelitis virus (PHEV), belongs to the virus species
Betacoronavirus1 of lineage A of the genus Betacoronavirus (de Groot et al., 2012). A recently
isolated canine respiratory coronavirus (CRCoV) has also shown a high genetic similarity to
Betacoronavirus1 (Erles et al., 2007).


BCoV is an enveloped virus with a single-stranded, positive-sense, non-segmented RNA genome 
of approximately 31 kb (Clark, 1993). A 4092 nucleotide (nt) fragment of the BCoV genome
encodes the large petal-shaped surface spike (S) protein. This is a type I membrane glycoprotein
of 1363 amino acids that comprises two hydrophobic regions, an amino-terminal (N-terminal)
signal sequence and a carboxyl-terminal (C-terminal) membrane anchor (Parker et al., 1990). The
S protein is cleaved by an intracellular protease between aa 768 and 769 to form two functionally
distinct subunit domains, a variable S1 N-terminal domain and the more conserved S2 C-terminal
domain (Abraham et al., 1990). The S1 subunit is a peripheral protein that mediates virus
binding to host-cell receptors (Li, 2012; Peng et al., 2012), has haemagglutinating activity
(Schultze et al., 1991) and induces neutralizing antibodies (Yoo & Deregt, 2001). The S2 subunit
is a transmembrane protein which mediates fusion of viral and cellular membranes (Yoo et al.,
1991a).


BCoV is the causative agent of neonatal calf diarrhea (CD), winter dysentery (WD) in adult 
cattle (Alenius et al., 1991; Mebus et al., 1973; Saif et al., 1988), and respiratory tract disorders 
in cattle of all ages (Cho et al., 2001; Decaro et al., 2008a; Lathrop et al., 2000). This infection is 
not effectively controlled in the herds by current commercial vaccines (Saif, 2010). BCoV 
negatively impacts the cattle industry through reduced milk production, loss of body condition
and the death of young animals (Clark, 1993; Saif, 2010). BCoV outbreaks most often occur during
fall and winter (Clark, 1993). However, studies from various climate regions have also reported
BCoV outbreaks in the warmer seasons (Bidokhti et al., 2012; Decaro et al., 2008b; Park et al.,
2006).


Studies have shown high prevalence of BCoV infections in cattle farms in many countries 
(Fulton et al., 2011; Paton et al., 1998; Saif, 2010; Travén et al., 2001). In addition,
BCoV-like coronaviruses transmissible to gnotobiotic calves have been found among various wild
ruminants (Alekseev et al., 2008; Tsunemitsu et al., 1995). Concern about the public health
impact of BCoVs has also been raised by the isolation of a BCoV-like human enteric coronavirus,
HECV-4408/US/94, from a child with acute diarrhoea (Zhang et al., 1994), and by the outbreaks of
severe acute respiratory syndrome CoV (SARS-CoV) (Groneberg et al., 2003; Zhong & Wong, 2004).
Molecular evolutionary analysis of HCoV-OC43 isolates suggests that BCoV is their genetically
closest counterpart among other CoV species (Vijgen et al., 2006). Recently, a novel
coronavirus, HCoV-EMC, was found circulating in the Middle East, causing deaths with clinical
signs similar to those of SARS-CoV infection (Al-Ahdal et al., 2012; Zaki et al., 2012). Such
veterinary and public health concerns justify the study of the genetic diversity and evolution
of BCoV strains and their relationship with the other betacoronaviruses.


The S gene sequence of BCoV has been exploited for epidemiological (Bidokhti et al., 2012; 
Decaro et al., 2008c; Hasoksuz et al., 2002; Jeong et al., 2005; Lathrop et al., 2000; Liu et al., 
2006; Martinez et al., 2012) and evolutionary (Vijgen et al., 2005b; Woo et al., 2012) studies. So
far, no study has systematically defined the positive selection pattern of the S protein of BCoV
strains, which is probably important for BCoV adaptive evolution. In the present study, to better
understand the epidemiologic dynamics of BCoV and to investigate the adaptive evolutionary 
process of BCoVs, we sequenced the full-length S gene and analyzed molecular epidemiology, 
evolution and selective pressures of this virus in cattle herds in Sweden and Denmark. Reference
strains from other hosts in Betacoronavirus1, including human, wild ruminants, pig and horse, as
well as CRCoV from dog, were included in this analysis to estimate their time of divergence and
update their genetic relationship.


RESULTS


Sequence data and genome analysis 


Comparative analysis of the S gene (4092 nt) indicated that all 33 Swedish and Danish strains 
(GenBank accession numbers: KF169908-KF169940) shared a high degree of sequence identity 
both at nt level (>97.8%) and deduced aa level (>97.4%). Compared with the
BCoV/Mebus/US/72 strain, 78 to 113 nt substitutions (97.2% to 97.9% sequence identity) were 
found resulting in 37-54 aa changes (96% to 97.2% sequence identity) within the entire S gene of 
the strains. The 100% identical strains SWE/I/07-3, SWE/I/07-4 and SWE/I/07-5 from Sweden
were found to be 99.7% similar to the strain SWE/P/09-1. SWE/I/07-3 and SWE/I/07-4 were
obtained from different cows with enteric disease in the same herd on the island of Gotland in
south-eastern Sweden. SWE/I/07-5 was obtained from another herd on Gotland at the same time.
SWE/P/09-1 was obtained from a cow with respiratory disease in a herd in south-western Sweden.


SWE/N/05-1 and SWE/N/05-2, which differed by 8 nt substitutions (99.8% identity), were sampled
from different calves with enteric disorders on the same occasion in a large dairy herd. The
oldest strain, SWE/C/92, showed the highest identity (nt 98.7%, aa 98.7%) to an old strain,
DEN/03-3, and the lowest identity (nt 97.8%, aa 97.4%) to a recent strain, SWE/M/10-1.
SWE/Y/10-3 from northern Sweden and SWE/P/10-4 from south-western Sweden showed 99.9% nt
identity. These strains were obtained during the same year from different regions.
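
The identity figures in this section follow directly from substitution counts over the 4092-nt
alignment. As a minimal sketch (not the software used in the study), the Python functions below
compute percent identity either from two aligned sequences or from a substitution count. The
function names are illustrative, and the alignment is assumed to contain no gaps.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity between two aligned sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def identity_from_substitutions(n_substitutions: int, length: int) -> float:
    """Percent identity implied by a substitution count over an ungapped alignment."""
    return 100.0 * (1 - n_substitutions / length)


S_GENE_LENGTH = 4092  # nt, as stated in the text

# 8 nt substitutions between SWE/N/05-1 and SWE/N/05-2
print(round(identity_from_substitutions(8, S_GENE_LENGTH), 1))   # 99.8
# 53 nt differences between SWE/M/06-3 and SWE/M/06-4
print(round(identity_from_substitutions(53, S_GENE_LENGTH), 1))  # 98.7
```

Both values agree with the identities reported for these strain pairs.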


The analysis of the predicted S proteins of the present 33 BCoV strains revealed a potential
N-terminal signal peptide of about 14 amino acids, as predicted by SignalP-HMM and SignalP-NN.
A potential S1/S2 cleavage site located after RRSRR, identical for BCoV (Abraham et al., 1990)
and some HCoV-OC43 (Lau et al., 2011), was identified in the S proteins of all strains except
the 2010 strains. The R-to-K aa change at position 764, leading to a KRSRR motif, was observed
in the S proteins of SWE/Y/10-3 and SWE/P/10-4. The A-to-E aa change at position 769, leading to
an RRSRRE motif, was observed downstream of the potential cleavage site in the S proteins of
SWE/M/10-1 and SWE/M/10-2. It has been suggested that changes in the last position of the motif
affect the S protein cleavability (Vijgen et al., 2005a). This cleavage process is believed to
play an important role in the fusion activity and viral infectivity of BCoV (Storz et al., 1981;
Vijgen et al., 2005a). More sequence data and experimental studies are required to clarify the
role of these changes in the cleavage site of BCoV. The analysis of the S protein showed 20
potential N-linked glycosylation sites in all Swedish and Danish BCoV strains, with nine NXS
(T133, M359, V437, P444, S696, D788, F895, I1234, Q1288) and eleven NXT (T59, F198, A649, R676,
N714, S739, C937, N1194, Y1224, Q1253, V1267) sites.
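
The motif-level observations above (the RRSRR cleavage motif and the NXS/NXT glycosylation
sequons) can be illustrated with a simple pattern scan. The sketch below is a stand-in under
stated assumptions: it uses the standard N-X-S/T sequon with X other than proline and an exact
string match for the cleavage motif, whereas the study used NetNGlyc, which is a trained
predictor rather than a bare motif match. The toy sequence and function names are hypothetical.

```python
import re

# Standard N-glycosylation sequon: N-X-S/T, where X is any residue except proline.
# The lookahead lets overlapping sequons be reported as well.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(protein: str) -> list[tuple[int, str]]:
    """Return (1-based position of N, sequon) for every N-X-S/T match."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(protein)]


def find_motif(protein: str, motif: str) -> list[int]:
    """Return 1-based start positions of an exact motif such as 'RRSRR'."""
    pattern = re.compile(f"(?={re.escape(motif)})")
    return [m.start() + 1 for m in pattern.finditer(protein)]


toy = "MKNATLLNPSQRRSRRAITNGSV"  # hypothetical fragment, not a real S protein
print(find_sequons(toy))         # [(3, 'NAT'), (20, 'NGS')]
print(find_motif(toy, "RRSRR"))  # [12]
print(find_motif("QKRSRRA", "RRSRR"))  # [] (a KRSRR variant no longer matches)
```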


Phylogenetic tree 


The analyzed samples showed low variability. Within the 4092 nt of the complete sequences of 
the S protein gene, 340 nt were variable (8.3%). At the aa level the variation was slightly larger 
(147 variable aa residues, 10.8%). Nucleotide p-distances among strains ranged between 0.1 and 
2.7%. This high degree of sequence identity is reflected in the NJ tree (Fig. 1): all Swedish and
Danish strains from 2002 to 2010 clustered together as a unique clade with the Italian strains
BuCoV/ITA/179-07-11, BCoV/438/06-2/ITA and BCoV/ITA/339/06. The oldest Swedish strain,
SWE/C/92, branched away from this clade and clustered into a separate clade with
BCoV/GER/M80844/89 and the human isolate HECV-4408/US/94. The remaining reference strains
derived from cattle and wild ruminants clustered irrespective of the host. The CRCoV clade was
most closely related to the BCoV and BCoV-like coronavirus clade, while the HCoV-OC43, PHEV
and ECoV clusters were more distant (Fig. 1).
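
The variability measures quoted above (number of variable sites and pairwise p-distances) are
simple column-wise and pair-wise counts over an alignment. The following Python sketch shows the
arithmetic on a hypothetical three-sequence toy alignment; it assumes an ungapped alignment and
is not the MEGA 5 implementation used in the study.

```python
from itertools import combinations


def variable_sites(alignment: list[str]) -> int:
    """Count alignment columns containing more than one distinct character."""
    return sum(len(set(column)) > 1 for column in zip(*alignment))


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of aligned sites at which two sequences differ."""
    diffs = sum(a != b for a, b in zip(seq_a, seq_b))
    return diffs / len(seq_a)


# Hypothetical toy alignment standing in for the 4092-nt S gene alignment.
alignment = ["ATGGCTAAAC", "ATGGCTAGAC", "ATGACTAGAC"]

n_var = variable_sites(alignment)
print(n_var, f"{100 * n_var / len(alignment[0]):.1f}%")  # 2 20.0%
for i, j in combinations(range(len(alignment)), 2):
    print(i, j, p_distance(alignment[i], alignment[j]))
```

Applied to the figures in the text, 340 variable sites out of 4092 nt correspond to 8.3%.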


Fifty-three nt differences were found between strains SWE/M/06-3 and SWE/M/06-4 (98.7% nt 
similarity, 98.1% aa similarity). These strains were obtained from two dairy herds with CD 
symptoms sampled at the same time in southern Sweden. SWE/M/06-3 clustered with
SWE/AC/08-1, SWE/E/08-2, SWE/Z/07-1, SWE/C/07-2, SWE/C/07-6 and SWE/U/09-3 (Fig. 1),
sharing more than 99.4% sequence similarity.


Evolutionary rate and estimation of divergence dates 


Molecular clock analysis of Swedish and Danish BCoV strains and reference strains of
Betacoronavirus1 was performed using a Bayesian coalescent approach to estimate their mean rate
of evolution and their time to the most recent common ancestor (TMRCA), which are shown in
detail in Table 3. The TMRCA of CRCoV and BCoV was dated to 1951. The mean evolutionary rate of
Swedish and Danish BCoV strains compared to CRCoV was estimated at 4.4×10⁻⁴ substitutions per
site per year. TMRCA analysis estimated earlier divergence of BCoV strains from HCoV-OC43
(1899), PHEV (1847) and ECoV (1797). The mean evolutionary rate of Swedish and Danish BCoV
strains was 4.1×10⁻⁴ substitutions per site per year compared to HCoV-OC43, 7.6×10⁻⁴ compared to
PHEV and 7.9×10⁻⁴ compared to ECoV. The TMRCA of BCoV and CoVs from wild ruminants was dated to
1963 and the mean rate of evolution was estimated to be 4.4×10⁻⁴ substitutions per site per
year. Swedish and Danish BCoV strains sequenced in this study showed the highest mean rates of
evolution relative to BCoV reference strains and HECV-4408/US/94, 8.7×10⁻⁴ and 8.3×10⁻⁴
substitutions per site per year, respectively. This resulted in almost the same estimated year
for the TMRCA, 1978 and 1977, respectively (Table 3).
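
The relation between an evolutionary rate, a TMRCA and the divergence expected between two
dated sequences can be made explicit with a back-of-the-envelope calculation. The Python sketch
below assumes a strict molecular clock and is not the Bayesian coalescent analysis performed in
BEAST. The rate is read as 4.4×10⁻⁴ substitutions per site per year (the exponent is an
assumption), and the two sampling years are hypothetical.

```python
def expected_divergence(rate: float, tmrca_year: float,
                        tip_year_a: float, tip_year_b: float) -> float:
    """Expected substitutions per site between two dated tips under a strict clock.

    Each lineage accumulates rate * (sampling year - TMRCA year) substitutions per site.
    """
    if min(tip_year_a, tip_year_b) < tmrca_year:
        raise ValueError("tips cannot predate their common ancestor")
    return rate * ((tip_year_a - tmrca_year) + (tip_year_b - tmrca_year))


def implied_tmrca(rate: float, divergence: float,
                  tip_year_a: float, tip_year_b: float) -> float:
    """Invert the relation above: TMRCA year implied by an observed divergence."""
    return (tip_year_a + tip_year_b - divergence / rate) / 2


RATE = 4.4e-4  # assumed exponent; substitutions per site per year
d = expected_divergence(RATE, tmrca_year=1951, tip_year_a=2010, tip_year_b=2003)
print(round(d, 5))                                   # 0.04884
print(round(implied_tmrca(RATE, d, 2010, 2003), 1))  # 1951.0
```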


Results from bootscan analysis were in line with the observations described above and in the
phylogenetic tree (Fig. 1). Bootscan analysis showed a number of possible recombination sites
when the S gene of BCoV strains was used as the query. Most of the region exhibited higher
bootstrap support for the clustering of BCoV strains with CRCoV, except upstream of position
500, where higher bootstrap support for clustering with HCoV-OC43 strains was observed.
Similar results were obtained when CRCoV strains were subjected to bootscan analysis (Fig.
S1). When the S gene of HCoV-OC43 strains was used as the query, the region downstream of
position 1800 exhibited higher bootstrap support for the clustering of HCoV-OC43 strains with
PHEV. Similar results were obtained when PHEV strains were subjected to bootscan analysis (Fig.
S1).
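
The windowing idea behind such a scan can be sketched in a few lines of Python. This is a
simplified illustration only: it reports, for each window, the reference with the highest
percent identity to the query, whereas a true bootscan (as in Simplot) builds bootstrapped trees
per window. The sequences, names, window size and step are hypothetical.

```python
def window_identity(query: str, reference: str,
                    window: int, step: int) -> list[tuple[int, float]]:
    """Percent identity of query vs reference in sliding windows.

    Returns (1-based window start, identity) pairs over an existing alignment.
    """
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    results = []
    for start in range(0, len(query) - window + 1, step):
        q = query[start:start + window]
        r = reference[start:start + window]
        matches = sum(a == b for a, b in zip(q, r))
        results.append((start + 1, 100.0 * matches / window))
    return results


def closest_reference(query: str, references: dict[str, str],
                      window: int, step: int) -> list[tuple[int, str]]:
    """For each window, name the reference with the highest identity to the query."""
    per_ref = {name: window_identity(query, seq, window, step)
               for name, seq in references.items()}
    starts = [s for s, _ in next(iter(per_ref.values()))]
    return [(s, max(per_ref, key=lambda name: per_ref[name][i][1]))
            for i, s in enumerate(starts)]


refs = {"ref1": "AAAAGGGG", "ref2": "TTTTCCCC"}     # hypothetical references
print(closest_reference("AAAACCCC", refs, 4, 4))    # [(1, 'ref1'), (5, 'ref2')]
```

A change in the closest reference along the alignment, as in this toy case, is the kind of
signal that prompts a closer look for recombination.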


Selective pressure sites 


The selection profiles of the aa sequence of all 33 Swedish and Danish BCoV strains showed two 
general patterns within the S protein. The cumulative dN-dS revealed that aa residues 109-131 
and 495-527 of the S1 subunit were under strong positive selection (Fig. 2a). Amino acid
residues 36-97, 315-420, 498-713, 910-1032, 1059-1234 and 1245-1279 were under negative
selection. These regions covered most of the S2 subunit, indicating that S2 is relatively stable
in BCoV (Fig. 2a).
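
A cumulative dN-dS curve rises where nonsynonymous changes outnumber synonymous ones and falls
where the opposite holds. The Python sketch below shows how such a curve and its steepest rising
stretch could be derived from per-codon counts. The counts are hypothetical and the function
names are illustrative; this is not the SNAP implementation.

```python
from itertools import accumulate


def cumulative_dn_minus_ds(dn: list[float], ds: list[float]) -> list[float]:
    """Running sum of per-codon (dN - dS) from the first to the last codon."""
    return list(accumulate(n - s for n, s in zip(dn, ds)))


def strongest_positive_stretch(dn: list[float], ds: list[float]) -> tuple[int, int, float]:
    """Return (first codon, last codon, net gain) of the stretch where the
    cumulative curve rises most, using 1-based codon numbers."""
    best_gain, best_start, best_end = float("-inf"), 0, 0
    run_gain, run_start = 0.0, 0
    for i, (n, s) in enumerate(zip(dn, ds)):
        if run_gain <= 0:
            run_gain, run_start = n - s, i  # start a new stretch at this codon
        else:
            run_gain += n - s               # extend the current stretch
        if run_gain > best_gain:
            best_gain, best_start, best_end = run_gain, run_start, i
    return best_start + 1, best_end + 1, best_gain


# Hypothetical per-codon counts for five codons (not data from the study).
dn = [0, 2, 3, 0, 0]
ds = [1, 0, 0, 2, 1]
print(cumulative_dn_minus_ds(dn, ds))      # [-1, 1, 4, 2, 1]
print(strongest_positive_stretch(dn, ds))  # (2, 3, 5)
```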


The SNAP analysis identified 133 positively selected sites, 89 of them in the S1 and 44 in the
S2 domain (Fig. 2b). Several of these sites were also identified by the REL method at a
posterior probability level of >90%. The following positive selection sites were identified by
both the SNAP and REL methods: 35, 112, 113, 115, 143, 147, 151, 157, 188, 257, 447, 458, 471,
482, 499, 501, 503, 510, 523, 525, 543, 546, 573, 578, 590, 596, 718, 722, 888, and 1239
(Table 4).
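
To see how the sites shared by both methods are distributed, they can be binned by subunit and
by the two strongly selected regions. The short Python sketch below does this using the site
list above, the S1/S2 cleavage position (between aa 768 and 769) given in the Introduction, and
the region boundaries 109-131 and 495-527 from the text. The function names are illustrative.

```python
SITES = [35, 112, 113, 115, 143, 147, 151, 157, 188, 257, 447, 458, 471, 482, 499,
         501, 503, 510, 523, 525, 543, 546, 573, 578, 590, 596, 718, 722, 888, 1239]
S1_LAST_RESIDUE = 768  # cleavage between aa 768 and 769


def split_by_subunit(sites: list[int], s1_last_residue: int) -> tuple[list[int], list[int]]:
    """Split site positions into those in S1 and those in S2."""
    s1 = [s for s in sites if s <= s1_last_residue]
    s2 = [s for s in sites if s > s1_last_residue]
    return s1, s2


def sites_in_region(sites: list[int], first: int, last: int) -> list[int]:
    """Sites falling inside an inclusive residue range."""
    return [s for s in sites if first <= s <= last]


s1, s2 = split_by_subunit(SITES, S1_LAST_RESIDUE)
print(len(s1), len(s2))                  # 28 2
print(sites_in_region(SITES, 109, 131))  # [112, 113, 115]
print(sites_in_region(SITES, 495, 527))  # [499, 501, 503, 510, 523, 525]
```

Of the 30 shared sites, 28 lie in S1, and nine of these fall within the two regions under
strong positive selection.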


Protein modelling comparisons 


To determine whether a homology model of the S protein could be generated for HECV-4408/US/94,
SWE/C/92, DEN/03-3, SWE/M/10-1 and GER/V270/83, each of these five sequences was searched
individually against the Protein Data Bank (PDB) entries
(http://www.rcsb.org/pdb/home/home.do) using default parameters. Based on the Z-score, all of
these S protein sequences had the highest structural similarity to the crystal structure of
murine hepatitis virus (PDB ID: 3R4D). Notably, the S1 sequences of the 33 BCoV strains
contain a putative receptor binding domain (aa residues 326 to 540, Fig. 2) with 94.8 to 97.6% aa
identities to sequences of BCoV/Mebus/US/72 and GER/V270/83. This part of the BCoV S
proteins had the highest sequence similarity to the SARS receptor-binding domain-like
superfamily (Scop ID: 143587), spanning aa residues 328-493 of the S protein of SARS-CoV, the
so-called C-domain (Wong et al., 2004). Sialic acid is known to be the receptor for S protein
binding in BCoV, although the receptor-binding domain is not well defined (Schultze et al.,
1991). The BCoV S protein also contains an N-terminal domain (NTD) spanning aa residues 15 to
298, as recently defined in detail (Peng et al., 2012), with 92.9 to 95% aa identities to
sequences of BCoV/Mebus/US/72 and GER/V270/83.


Default parameters were used in I-TASSER to predict structures of these proteins as explained in 
the Materials and Methods section. Results indicated that NTD and putative C-domain of S1 
were structurally similar for HECV-4408/US/94 and SWE/C/92 (Fig. 3a, b). This similarity is 
clearly illustrated when the two structures are aligned (Fig. 3c). In contrast, the predicted
structures for SWE/M/10-1 and GER/V270/83 were substantially divergent, while DEN/03-3
showed an intermediate conformation (Fig. 3d-f). In the S2 region, HECV-4408/US/94 and
SWE/C/92 also differed in conformation from the other strains. The residues primarily
predicted as potential receptor binding sites based on homology with the S protein of SARS-CoV
were used in the generation of structural models. Notably, parts of the putative receptor
binding domain and of the NTD were located in the regions under strong positive selection on the
surface of the S1 subunit (Fig. 3g, residues coloured green and red in SWE/C/92).


DISCUSSION


Circulation patterns of BCoV strains 


This is the first evolutionary study to include full-length S gene sequences of BCoV strains
obtained from European countries. The twenty-six Swedish and seven Danish BCoV strains sequenced
in this study show low genetic diversity, which results in their clustering as a unique clade in
the phylogenetic tree (Fig. 1). Based on the full-length S gene, we show that there are no
consistent differences between BCoV strains obtained from respiratory and enteric disease. This
is in accordance with our previous study of partial S sequences (Bidokhti et al., 2012). In two
herds, identical sequences (e.g. SWE/02-1 and SWE/I/07-3) were found in different cattle sampled
on the same occasion, supporting previous findings that a herd disease outbreak is caused by a
dominant strain (Bidokhti et al., 2012; Liu et al., 2006). However, in a large dairy herd (>200
cows) we found two slightly different (99.8%) CD strains, SWE/N/05-1 and SWE/N/05-2, which 
were circulating at the same time. This finding indicates that strains with genetic diversity, 
though limited, can circulate in such herds. Large dairy herds were previously found to have a
higher incidence of BCoV infection (Ohlson et al., 2010; Smith et al., 1998), which is consistent
with the concept that large herds may provide a favorable environment for virus introduction and
circulation of the strains.


A high similarity was observed between Italian and Swedish strains. We also identified a high 
similarity (99.4%) between the strain SWE/M/06-3 and six other strains that circulated in 2007
to 2009 in distant regions of Sweden, implying that certain strains may have the potential to
spread directly or indirectly to distant regions or to other countries. No identical strains obtained
from different epidemic seasons have been identified, but some strains were highly similar. High
stability of certain BCoV strains was shown by the finding of identical strains on the island of
Gotland in 2007 (e.g. SWE/I/07-3) and a highly similar strain obtained from another region in
2009 (SWE/P/09-1). Highly similar strains were also found in different regions in 2010
(SWE/Y/10-3, SWE/P/10-4). This suggests that these BCoV strains were part of common transmission
chains. These data support previous findings that S gene sequences can help to clarify the
transmission routes of BCoV strains (Bidokhti et al., 2012; Kanno et al., 2013).


Rate of evolution of BCoV strains 


This evolutionary analysis encompassed a large data set of full-length S gene sequences of
Betacoronavirus1 obtained over 45 years (1965-2010), including newly sequenced Swedish and
Danish BCoV strains from the last decade and one strain from 1992. Sampling over time
provides heterochronous data from which to calculate an evolutionary rate and to estimate the
time of divergence of the recent BCoV sequences. The estimated rate of nt substitution in the S
gene of BCoV (8.7×10⁻⁴ substitutions/site/year) is comparable to the standard range (orders
of 10° to 10°) in other rapidly evolving RNA viruses, such as nonstructural protein 2 (NSP2) of 
rotavirus A (Donker & Kirkwood, 2012) and E gene of Dengue virus 3 (Sall et al., 2010). 
The TMRCA estimate for the BCoV strains in this study together with published BCoV S gene
sequences from other countries was 1978 (95% CI: 1974 to 1981). This is more recent than the
estimate reported previously (Vijgen et al., 2006), which indicated a recent divergence of BCoVs
during the last 60 years, in 1944 (95% CI: 1910 to 1963). This implies a high ability of BCoV
to adapt to the cattle population and to spread over a large geographical region in a relatively
short period of time.


Molecular clock analysis of the spike gene of the recent BCoV strains and HCoV-OC43 strains
estimated an evolutionary rate in the order of 4.1×10⁻⁴ substitutions per site per year, which
is similar to a previous estimate of 4.7×10⁻⁴ substitutions per site per year (Vijgen et al.,
2005b). The Bayesian coalescent approach dated the TMRCA to around 1899, highly similar to the
previous estimate of around 1890 (Vijgen et al., 2005b). Evolutionary analysis of our BCoV
strains along with other virus species in Betacoronavirus1 demonstrated a closer relationship of
BCoV to canine and human CoVs than to porcine and equine CoVs. The TMRCA estimates are in
accordance with the clustering of the CoVs in the phylogenetic tree (Fig. 1). The divergence of
BCoV and CRCoV strains was estimated to have occurred five decades after that of BCoV and
HCoV-OC43 strains, suggesting a more recent common ancestor of the former. The spike protein of
CRCoV-4182/UK/03 has been shown to have a higher genetic similarity to BCoV/Mebus/US/72 and
BCoV/LY138/US/65 than to HCoV-OC43/VR759/UK/6 (Erles et al., 2007). In that study, the
cross-reactivity of CRCoV-4182/UK/03 with polyclonal antisera against BCoV was also shown
(Erles et al., 2007). This corresponds to what is illustrated in the phylogenetic tree (Fig. 1):
the clade of ruminant CoVs clusters closer to the clade of CRCoV strains than to the other virus
species in Betacoronavirus1. In the tree, CoVs from bovines and several wild ruminant species
clustered closely together, implying that interspecies transmission of these CoVs may occur, as
suggested previously (Alekseev et al., 2008).


In this study, we reported a close genetic relationship (98.9% nt identity, 98.6% aa identity) and
high predicted structural similarity of the S protein of HECV-4408/US/94 with a BCoV field
strain, SWE/C/92. The infectivity of HECV-4408/US/94 for gnotobiotic calves and complete
cross-protection against the BCoV/DB2/84 isolate, which shows 98.2% aa identity (98.6% nt
identity) to HECV-4408/US/94 in the S protein, have been experimentally confirmed (Han et al.,
2006). Thus, the similarity between the SWE/C/92 and HECV-4408/US/94 S protein conformations
further supports the hypothesis of possible interspecies transmission of these viruses. Future
studies to find novel strains of Betacoronavirus1 and to determine the structure of the S protein
would greatly assist in determining how such interspecies transmissions occur.


Positive selection on the S protein 


The selection profiles identified two main patterns within the subunit domains S1 and S2 of the S 
protein. The S1 subunit is exposed on the surface of the viral particle, and is the target of 
neutralizing antibodies (Deregt & Babiuk, 1987; Yoo & Deregt, 2001; Yoo et al., 1991b). The 
S1 subunit has two domains with a clear positive selection pattern (Fig. 2). Positively selected 
fragments of genes encoding viral proteins exposed on the surface of the capsid have been 
documented in other viruses, such as in porcine circovirus type 2 (PCV2) (Olvera et al., 2007) 
and porcine parvovirus (PPV) (Shangjin et al., 2009). There is an association between the
positively selected sites along the S1 subunit identified in this study and mapped neutralizing
epitopes. Epitopic fragments spanning aa residues 324-720 of the S1 subunit of BCoV and the
N-terminus of the S2 subunit spanning aa residues 769-798 have been previously recognized using
monoclonal antibodies (MAbs) (Vautherot et al., 1992a; Yoo et al., 1991b). A polymorphic
region spanning aa residues 456-592 has also been shown by sequence analysis of BCoV strains
(Rekik & Dea, 1994). It has been reported that mutations in the S1 and the N-terminus of the S2
sequence often result in changes in antigenicity (Kanno et al., 2013; Vautherot et al., 1992b;
Yoo & Deregt, 2001). Likewise, parts of the putative receptor binding domain defined in this
study and the NTD defined in detail in a previous study (Peng et al., 2012) were shown to be
under strong positive selection in the BCoV strains. Taken together, the motifs under strong
positive selection in the S protein may thus be associated with the immune response and
receptor binding, and would therefore be important in future BCoV vaccine development. A
negative selection pattern was observed for the S2 subunit (Fig. 2). Negative selection is
usually reported in genome fragments with essential functions in the viral lifecycle (Yang,
2005). For example, extensive syncytia formation was observed in cells infected with an S2
recombinant of BCoV (Yoo et al., 1991a). The structure of the SARS-CoV S2 fusion protein core
has been shown to provide a framework for the design of entry inhibitors that could be used in
therapeutic intervention against this virus (Supekar et al., 2004). Thus we speculate that the
S2 subunit, except its N-terminus, would mostly interact with cellular compartments rather than
with immune system elements of the host.


Vaccination with an inactivated vaccine against BCoV has been used only to a very limited extent in
Swedish cattle herds. Thus we conclude that selective pressure sites observed in the receptor 
binding subunit of S protein gene of BCoV strains indicate a natural mode of evolution that is 
mainly due to exposure to the host immune system. Currently available vaccines are based on old 
enteric BCoV strains, genetically and antigenically different from currently circulating BCoV
strains (Fulton et al., 2013). Thus, continuous monitoring of sequence changes in positive
selection sites may provide potentially useful data for identifying future dominant epidemic
strains. This can then help to update the vaccine strains.


Studies are also warranted to detect the emergence of new genotypes and recombinants of BCoV 
as well as other betacoronaviruses and to assess their significance and potential in causing future 
epidemics. Nevertheless, it should be noted that the sequencing of a single gene may not be 
sufficient to define the genotypes of BCoV, as previously shown for human betacoronaviruses 
(Lau et al., 2011; Woo et al., 2006). Based on the lessons from HCoV-OC43 genotyping (Lau et 
al., 2011) and recent evolutionary evaluation of the diverse genetic BCoV population through 
pioneering in-depth sequencing analysis (Borucki et al., 2013), deep sequencing of BCoV
should therefore be performed to better understand the molecular epidemiology of BCoV, to
determine genotypes and to reveal possible recombination events.


MATERIALS AND METHODS


Clinical samples. In total, thirty-three field samples (25 fecal and 8 nasal) from cattle in
29 herds (Table 1) in Sweden and Denmark were sequenced. Sampled animals in all herds showed
clinical signs of BCoV infection. The samples were collected during outbreaks that occurred
from 2002 to 2010. All seven Danish samples (one nasal and six fecal) were from 2003 and 2005.
The oldest Swedish strain, which was from a WD outbreak in Uppland in 1992, was also sequenced.
No cell culture-passaged virus was used in this study. Samples were kept frozen at -70°C until
analyzed.


RNA extraction, cDNA synthesis, primer pairs and PCR. RNA extraction with TRIzol LS 
reagent (Invitrogen) and cDNA synthesis with random priming were performed as described 
previously (Liu et al., 2006). In order to amplify and sequence the S gene (4092 nt), seven pairs 
of primers (Table 2) were used to generate a set of overlapping PCR products encompassing the 
entire S gene. Among these primers, six pairs (AF/AR, BF/BR, CF/CR, DF/DR, GF/GR, 
HF/HR) were already published (Hasoksuz et al., 2002; Jeong et al., 2005), while one pair
(EF/ER) was designed by our group.


Amplification of the full-length S gene was performed in a DNA Thermal Cycler (Perkin-Elmer) 
using Pfu Ultra DNA polymerase (Stratagene). Briefly, 1 µl of cDNA was amplified in a 50 µl
reaction containing 5 µl of 10x Pfu Ultra buffer, 1 µl of 10 mM dNTPs, 1 µl each of primers AF
and HR (10 µM), 2.5 U of Pfu Ultra DNA polymerase, and 40 µl of ddH2O. The cycling profile
consisted of 2 min of denaturation at 95°C followed by 35 cycles of 95°C for 30 s, 50°C for
60 s, and 72°C for 4 min, and a final extension step for 7 min at 72°C.
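
For planning purposes, the total hold time of a cycling profile such as the one above can be
tallied with a few lines of Python. This is only an arithmetic sketch: the function name is
illustrative, and instrument ramp times between temperatures are not included.

```python
def program_seconds(initial: int, cycles: int, steps: list[int], final: int) -> int:
    """Total hold time in seconds: initial step + cycles x per-cycle steps + final step."""
    return initial + cycles * sum(steps) + final


# First-round PCR: 2 min at 95°C; 35 cycles of 30 s, 60 s and 4 min; 7 min final extension.
first_round = program_seconds(initial=120, cycles=35, steps=[30, 60, 240], final=420)
print(first_round, first_round / 60)  # 12090 201.5
```

The first-round program therefore holds for about 201.5 minutes, excluding ramping.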


In order to increase the sensitivity of the PCR detection method, nested and semi-nested PCR
(N- and SN-PCR) assays were developed as described previously (Bidokhti et al., 2012). Briefly,
5 µl of the first PCR product was added to a tube with 45 µl of PCR mixture, comprising 5 µl of
10x PCR buffer, 1 µl of 10 mM dNTP mixture, 5 µl of 1 mg/ml bovine serum albumin, 1.5 µl of each
primer (10 µM), 5 µl of 25 mM MgCl2, 1 U of Taq DNA polymerase (AmpliTaq; Perkin-Elmer)
and 24 µl of water. The thermocycling profile included 35 cycles of denaturation at 94°C for
45 s, annealing at 50°C for 60 s, and extension at 72°C for 3 min, and a final extension at 72°C
for 7 min. For each strain, all seven fragments (A, B, C, D, E, G and H) were amplified by the
corresponding primer pairs.


DNA sequencing and genome analysis. All seven PCR products of each strain were purified 
and sequenced in both directions using the same primers as for PCR and an ABI PRISM BigDye 
Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems, Foster City, CA) as described (Liu 
et al., 2006). Capillary electrophoresis was performed in an ABI 3100 genetic analyzer (Applied 
Biosystems). Sequence chromatograms were aligned and assembled into a final 4092-nt 
fragment of the S gene, stretching from nt positions 23641 to 27733 (aa residues 1 to 1363 of
the S glycoprotein) of the BCoV strain Mebus.


Sequences were aligned with the ClustalW program available in the BioEdit Sequence
Alignment Editor (Hall, 1999). Phylogenetic tree construction was performed from the
nucleotide sequences using a Neighbour-Joining (NJ) algorithm with bootstrap values calculated
from 1,000 replicates in the program MEGA 5 (Tamura et al., 2011). The prediction of the 
receptor binding domain of spike protein was performed using InterProScan (Apweiler et al., 
2001). The prediction of potential N-glycosylation sites in the spike proteins was performed 
using the CBS NetNGlyc 1.0 server (http://www.cbs.dtu.dk/services/NetNGlyc/). Reference 
sequences of virus species of Betacoronavirus1, including BCoV, HCoV-OC43, PHEV, ECoV
and BCoV-like coronaviruses in wild ruminants, as well as CRCoV, were retrieved from GenBank
and included in this analysis (Table 1).
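
Bootstrap values of the kind used for the NJ tree rest on resampling alignment columns with
replacement. The Python sketch below shows only this resampling step on a hypothetical toy
alignment; building a tree from each replicate and counting clade frequencies, as MEGA does, is
not attempted here. Function names and the seed are illustrative.

```python
import random


def bootstrap_alignment(alignment: list[str], rng: random.Random) -> list[str]:
    """Resample alignment columns with replacement, keeping rows paired."""
    n_sites = len(alignment[0])
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return ["".join(seq[c] for c in columns) for seq in alignment]


def bootstrap_replicates(alignment: list[str], n_replicates: int,
                         seed: int = 0) -> list[list[str]]:
    """Generate pseudo-replicate alignments; a tree would be built from each one."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]


toy = ["ACGT", "ACGA"]  # hypothetical alignment
replicates = bootstrap_replicates(toy, n_replicates=1000, seed=1)
print(len(replicates), replicates[0])
```

Each replicate has the same dimensions as the input, and every column in it is a column of the
original alignment.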


Selective pressure analysis. To explore overall differences in selective pressure along the
complete S gene sequences of the Swedish and Danish BCoV strains, we analyzed the
occurrence of synonymous (dS) and nonsynonymous (dN) substitutions using the SNAP server
available at http://www.hiv.lanl.gov/content/sequence/SNAP/SNAP.html (Korber, 2000), which
plots the cumulative and per-codon occurrence of each type of substitution from the start to the
end of the S gene.
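
To make the distinction between the two substitution types concrete, the sketch below labels
each codon of a query sequence as unchanged, synonymous or nonsynonymous relative to a
reference and accumulates the counts along the gene. It is a deliberately simplified
illustration, not the SNAP algorithm: it counts changed codons only and does not normalize by
the number of synonymous and nonsynonymous sites. The function names and toy sequences are
hypothetical.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_codon_changes(reference, query):
    """Label each codon: '.' unchanged, 'S' synonymous, 'N' nonsynonymous,
    '?' if a codon contains gaps or ambiguous bases."""
    if len(reference) != len(query) or len(reference) % 3 != 0:
        raise ValueError("sequences must be codon-aligned and of equal length")
    labels = []
    for start in range(0, len(reference), 3):
        ref_codon = reference[start:start + 3].upper()
        qry_codon = query[start:start + 3].upper()
        if ref_codon == qry_codon:
            labels.append(".")
        elif ref_codon not in CODON_TABLE or qry_codon not in CODON_TABLE:
            labels.append("?")
        elif CODON_TABLE[ref_codon] == CODON_TABLE[qry_codon]:
            labels.append("S")
        else:
            labels.append("N")
    return labels


def cumulative_counts(labels):
    """Running totals of (synonymous, nonsynonymous) codon changes."""
    syn = 0
    nonsyn = 0
    totals = []
    for label in labels:
        if label == "S":
            syn += 1
        elif label == "N":
            nonsyn += 1
        totals.append((syn, nonsyn))
    return totals


# Toy codon-aligned sequences (made up): ATG TTT TTG ATA vs ATG TTC TTA GTA.
labels = classify_codon_changes("ATGTTTTTGATA", "ATGTTCTTAGTA")
totals = cumulative_counts(labels)
```

Plotting the running totals against codon position gives a curve of the same general kind as the
accumulated plot described above, under the simplifications stated.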


To examine the robustness of the positively selected sites identified by SNAP, we also
analyzed our datasets using the HyPhy package accessed through the Datamonkey facility,
http://www.datamonkey.org (Pond & Frost, 2005). Datamonkey includes the random effects
likelihood (REL) method for detecting sites under selection. To detect positively selected sites,
the default significance level of Bayes factor > 50 was used for REL. REL is often the only
method that can infer selection from small (5-15 sequences) or low-divergence alignments and
tends to be the most powerful test. The method was run on the Datamonkey web server using the
GTR substitution model on a neighbor-joining phylogenetic tree, in order to investigate selective
pressure along the S protein of the BCoV strains sequenced in this study. Bootscan analysis was
also used to detect possible recombination, using the nucleotide alignment of the S gene
sequences of virus species in Betacoronavirus 1 and also CRCoV. Bootscan analysis was
performed using SimPlot version 3.5.1 as described previously (Lau et al., 2011; Woo et al.,
2006), with BCoV, HCoV, ECoV, PHEV and CRCoV strains as the query.
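
The Bayes factor threshold used for REL can be read as a ratio of posterior odds to prior odds
that a site belongs to a positively selected rate class. The sketch below shows that arithmetic
only; it does not reproduce the HyPhy likelihood computations, and the site labels, prior and
posterior probabilities are hypothetical values chosen for illustration.

```python
def bayes_factor(prior, posterior):
    """Posterior odds divided by prior odds for membership in the selected class."""
    if not (0.0 < prior < 1.0 and 0.0 < posterior < 1.0):
        raise ValueError("probabilities must lie strictly between 0 and 1")
    posterior_odds = posterior / (1.0 - posterior)
    prior_odds = prior / (1.0 - prior)
    return posterior_odds / prior_odds


def positively_selected(sites, threshold=50.0):
    """Return the labels of sites whose Bayes factor exceeds the threshold.

    `sites` is a list of (label, prior, posterior) tuples.
    """
    return [
        label
        for label, prior, posterior in sites
        if bayes_factor(prior, posterior) > threshold
    ]


# Hypothetical sites: (label, prior probability, posterior probability).
example_sites = [
    ("site_a", 0.05, 0.95),  # Bayes factor about 361
    ("site_b", 0.05, 0.50),  # Bayes factor about 19
    ("site_c", 0.05, 0.80),  # Bayes factor about 76
]
selected = positively_selected(example_sites)
```

With a prior of 0.05, a posterior of 0.50 is not enough to pass the threshold of 50, whereas
posteriors of 0.80 and 0.95 are.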


Evolutionary rate and estimation of divergence dates. The rate of evolution and divergence
times were calculated from S gene sequence data using a Bayesian Markov chain Monte Carlo
(MCMC) approach implemented in the BEAST v.1.6.2 package (Drummond & Rambaut, 2007).
Three independent MCMC runs per dataset were performed under a strict molecular clock
model, using the Hasegawa-Kishino-Yano model of sequence evolution with a proportion of
invariant sites and gamma-distributed rate heterogeneity (HKY+I+Γ), with partitions into codon
positions and the remaining default parameters in the priors panel. For the S gene, the MCMC
run was 3×10⁷ steps long and the posterior probability distribution of the chains was sampled
every 1000 steps. Convergence was assessed on the basis of the effective sample size after a
10% burn-in using Tracer software, version 1.5 (Rambaut & Drummond, 2007). The estimates
are the mean values obtained for the three runs. The mean time of the most recent common
ancestor (tMRCA) and the 95% confidence interval (CI) were calculated, and the best-fitting
models were selected by a Bayes factor using marginal likelihoods implemented in Tracer
(Suchard et al., 2001).
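
The sampling scheme above can be sketched numerically. The code below works out how many
posterior samples remain after the 10% burn-in and summarizes a sample by its mean and a 95%
interval. Two assumptions are made loudly here: the chain length is read as 3×10⁷ steps, and
the interval is computed as a highest posterior density (HPD) interval, which is the kind of
interval Tracer reports, although the text calls it a confidence interval. The function names
and the toy sample are hypothetical.

```python
import math


def discard_burn_in(samples, fraction=0.10):
    """Drop the first `fraction` of the sampled states."""
    start = int(len(samples) * fraction)
    return samples[start:]


def mean_and_hpd(samples, mass=0.95):
    """Mean and the shortest interval containing `mass` of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    if n == 0:
        raise ValueError("no samples")
    window = max(1, math.ceil(mass * n))
    # Slide a window of fixed sample count and keep the narrowest one.
    best = min(
        range(n - window + 1),
        key=lambda i: ordered[i + window - 1] - ordered[i],
    )
    mean = sum(ordered) / n
    return mean, (ordered[best], ordered[best + window - 1])


# Sample bookkeeping under the assumed chain length.
chain_length = 30_000_000   # assumed reading of the chain length in the text
sample_every = 1000         # from the text
n_sampled = chain_length // sample_every
n_kept = len(discard_burn_in(list(range(n_sampled))))

# Toy posterior sample (made up) to exercise the summary function.
toy_kept = discard_burn_in(list(range(100)))
toy_mean, toy_hpd = mean_and_hpd(toy_kept)
```

Under these assumptions each run yields 30,000 sampled states, of which 27,000 remain after
burn-in; averaging the per-run means over the three runs would then give the reported estimate.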


In silico model analysis. Based on strain sequence identity and phylogenetic analysis, the aa
sequences of the S protein of five CoVs were chosen: HECV-4408/US/94 (the human isolate
most closely related to BCoV), SWE/C/92 (the oldest Swedish strain, which clustered with
HECV-4408/US/94), DEN/03-3 (the strain with the highest identity to SWE/C/92), SWE/M/10-1
(the strain with the lowest identity to SWE/C/92) and GER/V270/83 (a bovine reference isolate
from Germany). Initially, a metathreading approach was applied in I-TASSER (Zhang, 2008;
Zhang & Skolnick, 2004a) to identify templates for the submitted sequences in a non-redundant
Protein Data Bank structure library. From the generated consensus threading templates, the
fragments of the sequences were assembled into 3D models using modified replica-exchange
Monte Carlo simulations. To refine the overall topology, models were clustered in SPICKER
(Zhang & Skolnick, 2004b). A C-score was defined based on the quality of the threading
alignments and the convergence of parameters of the structure assembly simulations. The
structures were visualized and annotated in MacPyMOL v1.3 (Schrödinger, LLC).


LEGENDS OF FIGURES


Fig. 1. Neighbor-Joining tree based on the p-distance of the complete S gene nucleotide
sequences of the virus species Betacoronavirus 1, including the BCoV strains from Sweden and
Denmark sequenced in this study. Bootstrap values above 70% for 1,000 iterations are shown at
the branches.


Fig. 2. The distribution of accumulated (a) and per-codon (b) positive selection sites identified
using the SNAP server along the S protein of the BCoV strains sequenced in this study. The two
functionally distinct domains S1 and S2 are marked together with the cleavage site (vertical
arrow, aa residues 768-769). The first upper line represents the hypervariable regions. The
regions labeled with an asterisk were previously described (Bidokhti et al., 2012) and the rest
were found in this study, spanning aa residues 447-596, 718-722, 785-828, 875-888, 1235-1239
and 1275-1278. The second upper line represents the MAb binding sites previously described for
the S1 subunit (Yoo & Deregt, 2001) and for the S2 subunit (Vautherot et al., 1992b) of BCoV,
spanning aa residues 351-403, 517-621 and 769-798. The third upper line represents receptor
binding domains previously described: the N-terminal domain (NTD) spanning aa residues
15-298 of BCoV (Peng et al., 2012) and the C-domain spanning aa residues 318-510 of
SARS-CoV (Wong et al., 2004). The putative C-domain of the BCoV strains was predicted to
span aa residues 326-540 using InterProScan. The last two lines represent the negative and
positive selection motifs based on accumulated dN-dS. Thicker arrows show the strong selection
motifs as described in the results.


Fig. 3. Predicted 3D structures of the S proteins of several coronavirus strains:
HECV-4408/US/94 (a), SWE/C/92 (b), DEN/03-3 (d), SWE/M/10-1 (e) and GER/V270/83 (f).
In (c), the first two S proteins, HECV-4408/US/94 (red) and SWE/C/92 (cyan), were aligned
using MacPyMOL. In (g), the cleavage site of the S protein of SWE/C/92 (aa residues 768-769)
is labeled yellow, and the regions of the S protein under positive selection are shown in red (aa
residues 109-131) and green (aa residues 495-527). The regions of the S2 subunit under
purifying selection (910-1032, 1059-1234 and 1245-1279) are marked cyan. The putative
receptor binding domain (the so-called C-domain, spanning aa residues 326-540) is colored blue
and green.


Table 1. BCoV strains utilized in this study.

Strain/Isolate name   Sampling year   Sample origin   Sample type   Country   Previous label name’   Accession number

SWE/C/92 1992 Adult cattle Fecal Sweden C1-9202 JN795143' 
SWE/02-1 2002 Calf Nasal Sweden Nc1N-02a DQ121634° 
SWE/02-2 2002 Calf Nasal Sweden Nc2N-02 DQ121635 
SWE/02-3 2002 Calf Nasal Sweden Nc3N-02 DQ121637' 
SWE/02-4 2002 Calf Nasal Sweden Nc4N-02 DQ121638 
DEN/03-1 2003 Calf Fecal Denmark Kc1F-03 DQ121631 
DEN/03-2 2003 Calf Fecal Denmark Ac1F-03 DQ12161 gt 
DEN/03-3 2003 Calf Fecal Denmark Dc1F-03 DQ121622' 
DEN/05-1 2005 Cattle Fecal Denmark This study 
DEN/05-2 2005 Cattle Fecal Denmark This study 
DEN/05-3 2005 Cattle Nasal Denmark This study 
DEN/05-4 2005 Cattle Fecal Denmark This study 
SWE/N/05-1 2005* Calf Fecal Sweden N1-0511 JN795155' 
SWE/N/05-2 2005* Calf Fecal Sweden This study 
SWE/AC/06-1 2006 Adult cattle Fecal Sweden AC1-0611 JN795141' 
SWE/M/06-3* 2006 Calf Fecal Sweden This study 
SWE/M/06-4* 2006 Calf Fecal Sweden M2-0605 JN795154' 
SWE/Z/07-1 2007 Adult cattle Fecal Sweden Z2-0711 JN795163' 
SWE/C/07-2 2007 Adult cattle Fecal Sweden C4-0712 JN795146' 
SWE/I/07-3 2007 Adult cattle Fecal Sweden I3-0703 JN795151' 
SWE/I/07-4 2007! Adult cattle Fecal Sweden This study 
SWE/I/07-5 2007 Adult cattle Fecal Sweden This study 
SWE/C/07-6 2007 Adult cattle Fecal Sweden C3-0711 JN795145' 
SWE/AC/08-1 2008 Adult cattle Fecal Sweden Y1-0801 JN795161' 
SWE/C/08-2 2008 Adult cattle Fecal Sweden C5-0801 JN795147' 
SWE/I/08-3 2008 Adult cattle Fecal Sweden I4-0810 JN795152' 
SWE/P/09-1 2009 Adult cattle Nasal Sweden P1-0902 JN795159' 
SWE/C/09-2 2009 Calf Nasal Sweden C6-0903 JN795148' 
SWE/U/09-3* 2009 Calf Nasal Sweden U1-0907 JN795160' 
SWE/M/10-1 2010 Calf Fecal Sweden This study 
SWE/M/10-2 2010 Calf Fecal Sweden This study 
SWE/Y/10-3 2010 Calf Fecal Sweden This study 
Strain/Isolate name         Sampling year   Country           Accession number
SWE/P/10-4                  2010            Sweden            This study
GER/V270/83                 1983            Germany           EF193075
BCoV/GER/M80844/89          1989            Germany           M80844.1
BCoV/ITA/339/06*            2006            Italy             EF445634
BuCoV/ITA/179-07-11         2007            Italy             EU019216
WtDCoV/OH-WD470/94          1994            US, Ohio          FJ425187.1
BCoV/KWD2/KOR/02            2002            South Korea       AY935638.1
Nyala/KOR/10-1              2010            South Korea       HM573330.1
BCoV/KCD2/KOR/04            2004            South Korea       DQ389633
BCoV/LSU/94                 1994            US, Louisiana     AF058943
BCoV/US/OK-0514-3/96        1996            US, Louisiana     AF058944
WbCoV/OH-WD358/94           1994            US, Ohio          FJ425186.1
SDCoV/US/OH-WD388-GnC/94    1994            US, Ohio          FJ425190.1
WbCoV/OH-WD358-GnC/94       1994            US, Ohio          FJ425185.1
SDCoV/OH-WD388/94           1994            US, Ohio          FJ425189.1
SACoV/OH-1/03               2003            US, Ohio          EF424621.1
BCoV/AH65-E/OH/01           2001            US, Ohio          EF424615.1
BCoV/AH65-R/OH/01           2001            US, Ohio          EF424617.1
BCoV/ENT/US/98              1998            US, Texas         AF391541
GiCoV/OH3/03                2003            US, Ohio          EF424623.1
BCoV/AH187-E/OH/2000        2000            US, Ohio          EF424619.1
ACoV/OH/98                  1998            US, Oregon        DQ915164.2
BCoV/LUN/US/98              1998            US, Texas         AF391542
BCoV/DB2/84                 1984            US, MD            DQ811784
BCoV/F15/FRA/79             1979            France            D00731
BCoV/LY138/US/65            1965            US, Utah          AF058942
BCoV/Mebus/US/72            1972            US                U00735
BCoV/Quebec/72              1972            Canada            AF220295
HCoV-OC43/VR759/UK/67       1967            England           AY391777
PHEV/VW572/BEL/72           1972            Belgium           DQ011855.1
PHEV/67N/BEL/70             1970            Canada            AY078417
HECV-4408/US/94             1994            US, Louisiana     L07748.1
BCoV/438/06-2/ITA           2006            Italy             EU814647.1
BCoV/Kakegawa/JAP           1976            Japan             AB354579
HCoV-OC43/BE03/BEL          2003            Belgium           AY903454

Sample origin (column entries for the strains above, top to bottom): Calf; Calf; Cattle; Buffalo
calf; White-tailed deer; Cattle; Nyala; Calf; Cattle; Cattle; Waterbuck; Sambar deer; Waterbuck;
Sambar deer; Sable antelope; Feedlot calf; Feedlot calf; Cattle; Giraffe; Feedlot calf; Alpaca;
Cattle; Cattle; Cattle; Cattle; Cattle; Human; Pig; Piglet; Human infant; Feedlot calf; Cattle;
Human infant; Gn calf.

Sample type (column entries for the strains above, top to bottom): Fecal; Fecal; Fecal; Nasal;
Fecal; Fecal; Fecal; Fecal; Nasal; Fecal; Fecal; Nasal; Tonsil; Fecal; Nasal; Fecal; Nasal.


HCoV-OC43/BE04/BEL 2004 Human infant Nasal Belgium AY903455 
HCoV-OC43/Paris/01 2001 Adult human Nasal France AY585229 
CRCoV/02/005/JAP 2002 Puppy Nasal Japan AB242262.1 
CRCoV/JAP/07 2007 Dog Nasal Japan AB370269.1 
CRCoV-4182/UK/03 2003 Puppy Nasal England DQ682406 
CRCoV/240/05/ITA 2005 Dog Nasal Italy EU999954 
ECoV/Tokachi/09/JAP 2009 Horse Fecal Japan BAJ52885.1 
ECoV/NC/99/US 1999 Foal Fecal US, North Carolina EF446615.1 

* Samples were collected during the warm season.
‘ Strains were partially sequenced previously and their fragments A and B are available in
databases. Other fragments of these strains were sequenced in this study.
* The label names of strains partially sequenced in our previous studies (Bidokhti et al., 2012;
Liu et al., 2006) are designated here.
* Samples were collected from the same farm in November 2005.
‘ Samples were collected from the same farm in March 2007.


Table 2. S gene and reference of primer pairs used in this study.

Primer name   Primer sequence (5'-3')                          Primer location   Primer reference
AF            5'-ATG TTT TTG ATA CTT TTA ATT-3'                1-21              (Hasoksuz et al., 2002)
AR            5'-AGT ACC ACC TTC TTG ATA AA-3'                 654-635           (Hasoksuz et al., 2002)
BF            5'-ATG GCA TTG GGA TAC AG-3'                     549-565           (Hasoksuz et al., 2002)
BR            5'-TAA TGG AGA GGG CAC CGA CTT-3'                1039-1018         (Hasoksuz et al., 2002)
CF            5'-GGG TTA CAC CTC TCA CTT CT-3'                 782-801           (Hasoksuz et al., 2002)
CR            5'-GCA GGA CAA GTG CCT ATA CC-3'                 1550-1531         (Hasoksuz et al., 2002)
DF            5'-GTC CGT GTA AAT TGG ATG GG-3'                 1460-1479         (Hasoksuz et al., 2002)
DR            5'-TGT AGA GTA ATC CAC ACA GT-3'                 2286-2267         (Hasoksuz et al., 2002)
EF            5'-GAA CCA GCA TTG CTA TTT CGG A-3'              2109-2131         This study
ER            5'-TTA TAA CTT TGC ACA CAA ATG AGG TC-3'         2876-2851         This study
GF            5'-CCC TGT ATT AGG TTG TTT AG-3'                 2691-2710         (Jeong et al., 2005)
GR            5'-ACC ACT ACC AGT GAA CAT CC-3'                 3606-3587         (Jeong et al., 2005)
HF            5'-GTG CAG AAT GCT CCA TAT GGT-3'                3439-3459         (Jeong et al., 2005)
HR            5'-TTA GTC GTC ATG TGA TGT TT-3'                 4092-4073         (Jeong et al., 2005)

F: Forward primer.
R: Reverse primer.
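
The primer locations in Table 2 can be used to check that the seven amplicons tile the whole
4092-nt S gene with overlaps between neighbours. The sketch below does this bookkeeping; it
assumes that each amplicon runs from the first position of its forward primer to the first
listed position of its reverse primer, and the function names are hypothetical.

```python
# Amplicon spans (start, end) taken from the primer locations in Table 2.
FRAGMENTS = {
    "A": (1, 654),
    "B": (549, 1039),
    "C": (782, 1550),
    "D": (1460, 2286),
    "E": (2109, 2876),
    "G": (2691, 3606),
    "H": (3439, 4092),
}
GENE_LENGTH = 4092  # nt, as stated for the assembled S gene fragment


def neighbour_overlaps(fragments):
    """Overlap length (nt) between each pair of consecutive amplicons."""
    ordered = sorted(fragments.items(), key=lambda item: item[1][0])
    result = {}
    for (name_1, (_, end_1)), (name_2, (start_2, _)) in zip(ordered, ordered[1:]):
        result[name_1 + "/" + name_2] = end_1 - start_2 + 1
    return result


def covers_gene(fragments, length):
    """True if the amplicons cover positions 1..length without a gap."""
    covered_to = 0
    for start, end in sorted(fragments.values()):
        if start > covered_to + 1:
            return False
        covered_to = max(covered_to, end)
    return covered_to >= length


overlaps = neighbour_overlaps(FRAGMENTS)
complete = covers_gene(FRAGMENTS, GENE_LENGTH)
n_codons = GENE_LENGTH // 3  # 1363 aa residues plus the stop codon
```

Under these assumptions every neighbouring pair overlaps by at least 91 nt, which is what allows
the seven reads to be assembled into one contiguous sequence.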


Table 3. Mean estimations for the rate of evolution and TMRCA of the Swedish and Danish
BCoV strains in comparison with the reference strains in Betacoronavirus 1.


Reference strains BCoV strains 
Mean rate of evolution TMRCA 
substitution /site /year (x 10°) 
Human (HECV-4408) 8.3 (6.7 - 9.9)* 1977 (1975-1980) 
BCoV reference strains 8.7 (7.0 - 10.5) 1978 (1974 - 1981) 
Wild ruminants 4.4 (3.2 - 5.7) 1963 (1954-1970) 
Canine (CRCoV) 4.4 (3.2 - 5.5) 1951 (1939-1961) 
Human (HCoV-OC43) 4.1 (3.2 - 4.7) 1899 (1884-1915) 
Porcine (PHEV) 7.6 (6.0 - 9.3) 1847 (1815 - 1875) 
Equine 7.9 (6.2 - 9.9) 1797 (1752-1844) 
* 95% confidence interval (CI) values are between brackets.


Table 4. REL analysis results for the S protein of the BCoV sequence strains.

No. of      Mean     No. of positively   Posterior     Positively selected sites*
sequences   dN-dS‡   selected sites      probability
30†         2.04     39                  >90%          35, 112, 113, 115, 143, 147, 151,
                                                       157, 188, 257, 447, 458, 471, 482,
                                                       499, 501, 503, 510, 523, 525, 543,
                                                       546, 573, 578, 590, 596, 718, 722,
                                                       805, 828, 881, 883, 888, 1034,
                                                       1120, 1206, 1237, 1239, 1278

† Three identical sequences were excluded from analysis.
‡ Because dS could be 0 for some sites, Datamonkey reports dN-dS in place of dN/dS.
* Positively selected sites identified with posterior probability p > 95% are in boldface. The
underlined ones are also reported by SNAP.
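
The site list in Table 4 can be cross-checked against the subunit boundary and the two
strongly selected regions named in the figure legends. The sketch below assigns each site to S1
or S2 using the cleavage site at aa residues 768-769 and counts the sites falling in regions
109-131 and 495-527. The function names are hypothetical; the positions are those listed in
Table 4.

```python
# Positively selected sites as listed in Table 4.
SITES = [
    35, 112, 113, 115, 143, 147, 151, 157, 188, 257, 447, 458, 471, 482,
    499, 501, 503, 510, 523, 525, 543, 546, 573, 578, 590, 596, 718, 722,
    805, 828, 881, 883, 888, 1034, 1120, 1206, 1237, 1239, 1278,
]
S1_LAST_RESIDUE = 768  # cleavage between aa residues 768 and 769 (Fig. 2 legend)
STRONG_REGIONS = {"109-131": (109, 131), "495-527": (495, 527)}


def split_by_subunit(sites, s1_last=S1_LAST_RESIDUE):
    """Separate sites into those in S1 (<= s1_last) and those in S2."""
    s1 = [site for site in sites if site <= s1_last]
    s2 = [site for site in sites if site > s1_last]
    return s1, s2


def sites_in_region(sites, start, end):
    """Sites lying within the inclusive residue range start..end."""
    return [site for site in sites if start <= site <= end]


s1_sites, s2_sites = split_by_subunit(SITES)
region_hits = {
    name: sites_in_region(SITES, start, end)
    for name, (start, end) in STRONG_REGIONS.items()
}
```

By this count, 28 of the 39 listed sites lie in S1 and 11 in S2, with three sites in region
109-131 and six in region 495-527.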